Efficient Super Granular SVM Feature Elimination (Super GSVM-FE) model for protein sequence motif information extraction

Authors

  • Bernard Chen
  • Stephen Pellicer
  • Phang C. Tai
  • Robert W. Harrison
  • Yi Pan
Abstract

Protein sequence motifs are attracting increasing attention in the field of sequence analysis. These conserved regions have the potential to determine the conformation, function and activities of proteins. We develop a new method that combines the concept of granular computing with the power of Ranking-SVM to further extract protein sequence motif information generated by the FGK model. Applying this new feature elimination model increases the quality of the motif information dramatically in all three evaluation measures. Since the training step of Ranking-SVM is very time consuming, we also provide a feasible way to reduce the training time dramatically without sacrificing quality.
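
The abstract does not spell out the elimination procedure, so the following is only a minimal sketch of one possible reading: the data are split into granules, the members of each granule are ranked, and the lowest-ranked members are dropped. The function names, the `keep_fraction` parameter, and the toy data are all hypothetical. The default scoring function (negative distance to the granule centroid) is a simple stand-in and is not a trained Ranking-SVM; a real ranker could be passed in through the `score` argument.

```python
from math import dist
from statistics import fmean


def centroid(points):
    """Component-wise mean of a list of equal-length numeric vectors."""
    return [fmean(column) for column in zip(*points)]


def eliminate_per_granule(granules, keep_fraction=0.8, score=None):
    """Rank the members of each granule and keep the top-ranked fraction.

    granules: dict mapping a granule id to a list of numeric vectors.
    keep_fraction: hypothetical parameter, share of members kept per granule.
    score: callable(member, members) -> float, higher means better ranked.
           The default is a stand-in (negative distance to the granule
           centroid), NOT a Ranking-SVM.
    """
    if score is None:
        def score(member, members):
            return -dist(member, centroid(members))

    kept = {}
    for granule_id, members in granules.items():
        # Best-scoring members first.
        ranked = sorted(members, key=lambda m: score(m, members), reverse=True)
        # Keep at least one member of every non-empty granule.
        n_keep = max(1, round(keep_fraction * len(ranked)))
        kept[granule_id] = ranked[:n_keep]
    return kept


# Toy data (hypothetical): two granules of 2-D vectors.
granules = {
    "g1": [[0, 0], [1, 0], [0, 1], [1, 1], [10, 10]],
    "g2": [[5, 5], [5, 6]],
}
kept = eliminate_per_granule(granules, keep_fraction=0.8)
```

Because each granule is ranked on its own, any ranker plugged in through `score` only ever sees the members of one granule at a time, which is one way a granular split can keep each individual ranking problem small.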

Related articles

Discovery and Extraction of Protein Sequence Motif Information that Transcends Protein Family Boundaries

Protein sequence motifs are gathering more and more attention in the field of sequence analysis. The recurring patterns have the potential to determine the conformation, function and activities of the proteins. In our work, we obtained protein sequence motifs which are universally conserved across protein family boundaries. Therefore, unlike most popular motif discovering algorithms, our input ...

Granular support vector machines with association rules mining for protein homology prediction

OBJECTIVE: Protein homology prediction between protein sequences is one of the critical problems in computational biology. Such a complex classification problem is common in medical or biological information processing applications. How to build a model with superior generalization capability from training samples is an essential issue for mining knowledge to accurately predict/classify unseen new s...

Granular Support Vector Machines Based on Granular Computing, Soft Computing and Statistical Learning

With the emergence of biomedical informatics, Web intelligence, and E-business, new challenges are arising for knowledge discovery and data mining modeling problems. In this dissertation work, a framework named Granular Support Vector Machines (GSVM) is proposed to systematically and formally combine statistical learning theory, granular computing theory and soft computing theory to address challeng...

Application of latent semantic analysis to protein remote homology detection

MOTIVATION: Remote homology detection between protein sequences is a central problem in computational biology. Discriminative methods such as the support vector machine (SVM) are among the most effective approaches. Many of the SVM-based methods focus on finding useful representations of protein sequences, using either explicit feature vector representations or kernel functions. Such representati...

A Novel Prediction Method of Protein Structural Classes Based on Protein Super-Secondary Structure

At present, feature extraction from protein sequences is the most basic issue in predicting protein structural classes and is also the key factor determining the quality of prediction. In order to predict protein structural classes accurately, we construct a 14-dimensional feature vector based on protein secondary and super-secondary structure information to reflect the content and spatial orderi...

Journal:
  • I. J. Functional Informatics and Personalised Medicine

Volume 1

Publication date: 2008